Macroservers: An Execution Model for DRAM Processor-In-Memory Arrays

Author: Sterling Thomas L.
Zima Hans P.
Publication venue: 'California Institute of Technology Library'
Publication date: 01/01/2000

The emergence of semiconductor fabrication technology allowing a tight coupling between high-density DRAM and CMOS logic on the same chip has led to the important new class of Processor-In-Memory (PIM) architectures. Newer developments provide powerful parallel processing capabilities on the chip, exploiting the facility to load wide words in single memory accesses and supporting complex address manipulations in the memory. Furthermore, large arrays of PIMs can be arranged into a massively parallel architecture. In this report, we describe an object-based programming model based on the notion of a macroserver. Macroservers encapsulate a set of variables and methods; threads, spawned by the activation of methods, operate asynchronously on the variables' state space. Data distributions provide a mechanism for mapping large data structures across the memory region of a macroserver, while work distributions allow explicit control of bindings between threads and data. Both data and work distributions are first-class objects of the model, supporting the dynamic management of data and threads in memory. This offers the flexibility required for fully exploiting the processing power and memory bandwidth of a PIM array, in particular for irregular and adaptive applications. Thread synchronization is based on atomic methods, condition variables, and futures. A special type of lightweight macroserver allows the formulation of flexible scheduling strategies for the access to resources, using a monitor-like mechanism.

CiteSeerX

Caltech Authors

User-Defined Data Distributions in High-Level Programming Languages

Author: Diaconescu Roxana E.
Zima Hans P.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

One of the characteristic features of today’s high performance computing systems is a physically distributed memory. Efficient management of locality is essential for meeting key performance requirements for these architectures. The standard technique for dealing with this issue has involved the extension of traditional sequential programming languages with explicit message passing, in the context of a processor-centric view of parallel computation. This has resulted in complex and error-prone assembly-style codes in which algorithms and communication are inextricably interwoven. This paper presents a high-level approach to the design and implementation of data distributions. Our work is motivated by the need to improve the current parallel programming methodology by introducing a paradigm supporting the development of efficient and reusable parallel code. This approach is currently being implemented in the context of a new programming language called Chapel, which is designed in the HPCS project Cascade

NASA Technical Reports Server

Caltech Authors

Fault Tolerance Middleware for a Multi-Core System

Author: James Mark
Some Raphael R.
Springer Paul L.
Wagner David A.
Zima Hans P.

Fault Tolerance Middleware (FTM) provides a framework to run on a dedicated core of a multi-core system and handles detection of single-event upsets (SEUs), and the responses to those SEUs, occurring in an application running on multiple cores of the processor. This software was written expressly for a multi-core system and can support different kinds of fault strategies, such as introspection, algorithm-based fault tolerance (ABFT), and triple modular redundancy (TMR). It focuses on providing fault tolerance for the application code, and represents the first step in a plan to eventually include fault tolerance in message passing and the FTM itself. In the multi-core system, the FTM resides on a single, dedicated core, separate from the cores used by the application. This is done in order to isolate the FTM from application faults and to allow it to swap out any application core for a substitute. The structure of the FTM consists of an interface to a fault tolerant strategy module, a responder module, a fault manager module, an error factory, and an error mapper that determines the severity of the error. In the present reference implementation, the only fault tolerant strategy implemented is introspection. The introspection code waits for an application node to send an error notification to it. It then uses the error factory to create an error object, and at this time, a severity level is assigned to the error. The introspection code uses its built-in knowledge base to generate a recommended response to the error. Responses might include ignoring the error, logging it, rolling back the application to a previously saved checkpoint, swapping in a new node to replace a bad one, or restarting the application. The original error and recommended response are passed to the top-level fault manager module, which invokes the response. The responder module also notifies the introspection module of the generated response. 
This provides additional information to the introspection module that it can use in generating its next response. For example, if the responder triggers an application rollback and errors are still occurring, the introspection module may decide to recommend an application restart

NASA Technical Reports Server

Human Galectins Induce Conversion of Dermal Fibroblasts into Myofibroblasts and Production of Extracellular Matrix: Potential Application in Tissue Engineering and Wound Repair

Members of the galectin family of endogenous lectins are potent adhesion/growth-regulatory effectors. Their multi-functionality opens possibilities for their use in bioapplications. We studied whether human galectins induce the conversion of human dermal fibroblasts into myofibroblasts (MFBs) and the production of a bioactive extracellular matrix scaffold suitable for cell culture. Testing a panel of galectins of all three subgroups, including natural and engineered variants, we detected activity for the proto-type galectin-1 and galectin-7, the chimera-type galectin-3 and the tandem-repeat-type galectin-4. The activity of galectin-1 required the integrity of the carbohydrate recognition domain. It was independent of the presence of TGF-beta 1, but it yielded an additive effect. The resulting MFBs, relevant, for example, for tumor progression, generated a matrix scaffold rich in fibronectin and galectin-1 that supported keratinocyte culture without feeder cells. Of note, keratinocytes cultured on this substratum presented a stem-like cell phenotype with small size and keratin-19 expression. In vivo in rats, galectin-1 had a positive effect on skin wound closure 21 days after surgery. In conclusion, we describe the differential potential of certain human galectins to induce the conversion of dermal fibroblasts into MFBs and the generation of a bioactive cell culture substratum. Copyright (C) 2011 S. Karger AG, Base

Crossref

Open Access LMU

From FORTRAN 77 to Locality-Aware High Productivity Languages for Peta-Scale Computing

Author: Hans P. Zima
Publication venue: Hindawi Limited
Publication date: 01/01/2007

When the first specification of the FORTRAN language was released in 1956, the goal was to provide an "automatic programming system" that would enhance the economy of programming by replacing assembly language with a notation closer to the domain of scientific programming. A key issue in this context, explicitly recognized by the authors of the language, was the requirement to produce efficient object programs that could compete with their hand-coded counterparts. More than 50 years later, a similar situation exists with respect to finding the right programming paradigm for high performance computing systems. FORTRAN, as the traditional language for scientific programming, has played a major role in the quest for high-productivity programming languages that satisfy very strict performance constraints. This paper focuses on high-level support for locality awareness, one of the most important requirements in this context. The discussion centers on the High Performance Fortran (HPF) family of languages, and their influence on current language developments for peta-scale computing. HPF is a data-parallel language that was designed to provide the user with a high-level interface for programming scientific applications, while delegating to the compiler the task of generating an explicitly parallel message-passing program. We outline developments that led to HPF, explain its major features, identify a set of weaknesses, and discuss subsequent languages that address these problems. The final part of the paper deals with Chapel, a modern object-oriented language developed in the High Productivity Computing Systems (HPCS) program sponsored by DARPA. A salient property of Chapel is its general framework for the support of user-defined distributions, which is related in many ways to ideas first described in Vienna Fortran. This framework is general enough to allow a concise specification of sparse data distributions. 
The paper concludes with an outlook to future research in this area

Directory of Open Access Journals

Session B4: VLSI synthesis

Author: Hans P. Zima
Publication venue: 'Elsevier BV'

Crossref

Michail Bachtins Konzeption als Alternative zum Sozialistischen Realismus

Author: Günther Hans
Zima P. V.
Publication venue: 'John Benjamins Publishing Company'
Publication date: 01/01/1981

Günther H. Michail Bachtins Konzeption als Alternative zum Sozialistischen Realismus. In: Zima PV, ed. Semiotics and Dialectics. Linguistic & literary studies in Eastern Europe. Vol 5. Amsterdam: Benjamins; 1981: 137-177

Publications at Bielefeld University

A Static Parameter based Performance Prediction Tool for Parallel Programs

Author: Hans P. Zima
Thomas Fahringer
Publication date: 01/01/1993

This paper presents a Parameter based Performance Prediction Tool (P³T) which is part of the Vienna Fortran Compilation System (VFCS), a compiler that automatically translates Fortran programs into message passing programs for massively parallel architectures. The P³T is applied to an explicitly parallel program generated by the VFCS, which may contain synchronous as well as asynchronous communication and is attributed with parameters computed in a previous profiling run. It statically computes a set of optional parameters that characterize the behavior of the parallel program. This includes work distribution, the number of data transfers, the amount of data transferred, transfer times, network contention, and the number of cache misses. These parameters can be selectively determined for statements, loops, procedures, and the entire program; furthermore, their effect with respect to individual processors can be examined. The tool plays an important role in the VFCS by providin..

CiteSeerX